Course title |
Information Retrieval |
Semester |
95-1 |
Intended audience |
Program: Knowledge Management Program |
Instructor |
唐牧群 |
Course number |
LIS4012 |
Course identifier |
106 47000 |
Section |
|
Credits |
3 |
Full-year / half-year |
Half-year |
Required / elective |
Elective |
Class time |
Tuesday, periods 2, 3, 4 (9:10~12:10) |
Classroom |
普402 |
Remarks |
Elective course in the Resources area of the Knowledge Management Program. Enrollment limit: 80 students. |
Ceiba course web page |
http://ceiba.ntu.edu.tw/951ir |
Course introduction video |
|
Core competency mapping |
Diagram relating core competencies to course planning |
Course syllabus
|
To protect everyone's rights, please respect intellectual property rights and do not make illegal photocopies.
|
Course description |
The course is designed to provide an introduction to the use, design and
evaluation of information retrieval (IR) systems. It covers major components in the IR
process such as information needs, search strategies, indexing as well as
retrieval evaluation. Special attention will be given to users’ information
environment within which IR is situated.
|
Course objectives |
1. To develop the knowledge and skills needed to conduct effective online retrieval.
2. To acquire a basic understanding of the inner workings of information
retrieval systems and the knowledge to assess their functions and features.
3. To be aware of the relationships between different types of search tasks and
search tactics.
4. To gain hands-on experience in building a digital collection.
|
Course requirements |
|
Expected weekly study hours outside class |
|
Office Hours |
By appointment |
Required reading |
|
References |
Readings
Hersh, William R. (1996). Information retrieval: a health and biomedical
perspective. New York: Springer-Verlag.
Belew, Richard K. (2000). Finding out about: a cognitive perspective on
search engine technology and the WWW. Cambridge: Cambridge University Press.
Evaluation of Web-Based Search Engines Using User-Effort Measures. Available
online: http://libres.curtin.edu.au/libres13n2/tang.htm
PubMed
PubMed tutorials, available at http://www.nlm.nih.gov/bsd/disted/pubmed.html
PubMed help, available at
http://www.ncbi.nlm.nih.gov/books/bookres.fcgi/helppubmed/pubmedhelp.pdf
Greenstone
The software can be downloaded at
http://www.greenstone.org/cgi-bin/library?e=p-en-home-utfZz-8&a=p&p=download
Manuals (the “User’s guide” is most relevant to our purpose)
http://greenstone.sourceforge.net/wiki/index.php/Manual
Witten, Ian H., & Bainbridge, David (2003). How to build a digital library.
Amsterdam: Morgan Kaufmann Publishers. |
Grading (for reference only) |
No. |
Item |
Percentage |
Description |
1. |
Group project: IR evaluation |
30% |
Students will form groups of 4 to 5 to carry out the project. Each group will conduct an IR evaluation comparing 3 major Web-based search engines on two search topics.
a. To obtain the search topics, interview two users (preferably graduate students or faculty members), each on one research topic they are interested in. Collect from each user a search statement and associated query terms that you both agree best represent the user's information need.
b. For each search topic, submit the queries on the user’s behalf to the three search engines you are testing. Collect the first 20 links from each of the three returned sets.
c. Find out the degree of overlap among the three returned sets.
d. Mix the non-duplicate links (at most 20 × 3 = 60) together and strip the graphic cues, so that the user cannot tell which search engine each link came from.
e. For each link, record its originating search engine and its rank position.
f. Present the URLs in Microsoft Word files that allow the users to examine the actual webpage by clicking on its hyperlink. Ask them to judge the relevance of the pages based on a 0-4 scale (0 stands for not relevant at all; 4, very relevant).
g. Create an Excel or SPSS data file to input the relevance scores.
h. Compare the performance of the search engines based on 1) first 20 "full" precision and 2) search length 2 (i.e. the number of links the user has to go through to find two relevant documents).
i. Turn in an 8-page written report on your findings and present them in class.
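Steps c, e and h above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the required method: the function names, the toy URLs, and the relevance threshold of 3 are hypothetical, and "full" precision is assumed here to mean the sum of the graded scores in the first 20 links divided by the highest possible sum (20 × 4). Check the assigned reading on user-effort measures for the exact definitions to use in your report.

```python
from typing import Dict, List, Optional, Sequence, Tuple

MAX_SCORE = 4  # top of the 0-4 relevance scale from step f


def overlap(*result_lists: Sequence[str]) -> int:
    """Step c: number of URLs that appear in every one of the given lists."""
    return len(set.intersection(*(set(urls) for urls in result_lists)))


def pool_results(results: Dict[str, List[str]]) -> Dict[str, List[Tuple[str, int]]]:
    """Steps d-e: merge the lists, remembering each URL's engine and rank."""
    pooled: Dict[str, List[Tuple[str, int]]] = {}
    for engine, urls in results.items():
        for rank, url in enumerate(urls, start=1):
            pooled.setdefault(url, []).append((engine, rank))
    return pooled


def full_precision(scores: Sequence[int], k: int = 20) -> float:
    """Step h(1), assumed definition: graded scores in the top k over k * MAX_SCORE.

    Lists shorter than k are penalised, because missing links count as 0.
    """
    return sum(scores[:k]) / (k * MAX_SCORE)


def search_length(scores: Sequence[int], n_relevant: int = 2,
                  threshold: int = 3) -> Optional[int]:
    """Step h(2): links examined until n_relevant relevant ones are found.

    A link counts as relevant when its score >= threshold (an assumption).
    Returns None if the list runs out first.
    """
    found = 0
    for position, score in enumerate(scores, start=1):
        if score >= threshold:
            found += 1
            if found == n_relevant:
                return position
    return None


# Toy data: two engines, three links each (hypothetical URLs).
engine_a = ["u1", "u2", "u3"]
engine_b = ["u3", "u4", "u1"]
shared = overlap(engine_a, engine_b)                    # 2 URLs in common
pooled = pool_results({"A": engine_a, "B": engine_b})   # 4 distinct URLs

# Relevance judgments for one engine's links, in rank order.
scores_a = [0, 3, 1, 4, 2]
precision_a = full_precision(scores_a, k=5)             # 10 / 20 = 0.5
length_a = search_length(scores_a)                      # 4
```

Passing three lists to overlap gives the number of links returned by all three engines; passing two gives a pairwise figure. The pooled dictionary can be shuffled before it is shown to the user and later joined with the relevance scores entered in step g.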
|
2. |
In-class quiz |
10% |
The quiz is based on the lecture notes and handouts. It will be held in class on November 21st and will consist of 4 short questions. |
3. |
Online tutorial |
10% |
Search feature/command demo (accounts for 10% of your final grade)
Students will work in pairs to create and present a video demo that explains a search tactic or function available in the PubMed database.
|
4. |
Group project: Digital library |
40% |
Students will form groups of 4 to 5 to carry out the project.
Each group will collaboratively build a functional online information retrieval system using the Greenstone digital library (GSDL) open source software.
The project consists of three components: the implementation of a digital collection on a topic of your own choosing, a 10-15 page written report, and an oral presentation of the project at the end of the semester.
The digital collection should include:
a. A minimum of 50 documents representative of different document formats such as PDF, Word, and HTML.
b. An index structure that enables browsing of the collection.
c. The provision of fielded search.
The written report should:
d. Explain the aim and purpose of the collection, its intended users, and their information needs. Ideally, come up with an institutional context (real or imaginary) for the use of the collection.
e. Define your selection and indexing policies based on the aim and purpose stated above.
f. Include a graphic presentation of the browsable index structure and the rationale behind your design (i.e. explain why you chose certain facets/attributes to represent your collection).
|
5. |
Class participation |
10% |
Attendance at all class sessions is mandatory. Your grade will be based on your attendance and participation in class discussion. If you don’t get the chance to participate in class, submit your comments or questions to the online forum. |
|
Week |
Date |
Topic |
Week 1 |
9/19 |
Introduction to the syllabus; information retrieval in the broad context of human information seeking. |
Week 2 |
9/26 |
Introduction to PubMed
Demo of Camtasia
|
Week 3 |
10/03 |
Search tactics and strategies |
Week 4 |
10/10 |
No class |
Week 5 |
10/17 |
Relevance; evaluation and performance criteria
Demo of GSDL (Greenstone digital library)
*Turn in the search topics for your evaluation project (including search statement and query terms)
|
Week 6 |
10/24 |
Indexing: machine vs. human |
Week 7 |
10/31 |
PubMed demo presentation |
Week 8 |
11/07 |
Types and structures of vocabularies
*Turn in the topic for your final project (the aim, scope and intended users of your collection)
|
Week 9 |
11/14 |
Indexing policy: specificity and exhaustivity |
Week 10 |
11/21 |
IR models (partial vs. exact match): vector space, probabilistic, PageRank (cognitive authority)
In-class quiz
|
Week 11 |
11/28 |
Web search
Google Syntax
|
Week 12 |
12/05 |
IR evaluation presentation |
Week 13 |
12/12 |
Federated search |
Week 14 |
12/19 |
Interface design and usability |
Week 15 |
12/26 |
Extensions: collaborative filtering, citation indexing, information visualization |
Week 16 |
2007/1/02 |
Group project presentation |
Week 17 |
1/09 |
Group project presentation |
|